Feature Selection by Transfer Learning with Linear Regularized Models

نویسندگان

  • Thibault Helleputte
  • Pierre Dupont
چکیده

This paper presents a novel feature selection method for classification of high dimensional data, such as those produced by microarrays. It includes a partial supervision to smoothly favor the selection of some dimensions (genes) on a new dataset to be classified. The dimensions to be favored are previously selected from similar datasets in large microarray databases, hence performing inductive transfer learning at the feature level. This technique relies on a feature selection method embedded within a regularized linear model estimation. A practical approximation of this technique reduces to linear SVM learning with iterative input rescaling. The scaling factors depend on the selected dimensions from the related datasets. The final selection may depart from those whenever necessary to optimize the classification objective. Experiments on several microarray datasets show that the proposed method both improves the selected gene lists stability, with respect to sampling variation, as well as the classification performances.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Grafting: Fast, Incremental Feature Selection by Gradient Descent in Function Space

We present a novel and flexible approach to the problem of feature selection, called grafting. Rather than considering feature selection as separate from learning, grafting treats the selection of suitable features as an integral part of learning a predictor in a regularized learning framework. To make this regularized learning process sufficiently fast for large scale problems, grafting operat...

متن کامل

Quantitative Structure-Activity Relationship Study on Thiosemicarbazone Derivatives as Antitubercular agents Using Artificial Neural Network and Multiple Linear Regression

Background and purpose: Nonlinear analysis methods for quantitative structure–activity relationship (QSAR) studies better describe molecular behaviors, than linear analysis. Artificial neural networks are mathematical models and algorithms which imitate the information process and learning of human brain. Some S-alkyl derivatives of thiosemicarbazone are shown to be beneficial in prevention and...

متن کامل

Learning Networks of Stochastic Differential Equations

We consider linear models for stochastic dynamics. To any such model can be associated a network (namely a directed graph) describing which degrees of freedom interact under the dynamics. We tackle the problem of learning such a network from observation of the system trajectory over a time interval T . We analyze the l1-regularized least squares algorithm and, in the setting in which the underl...

متن کامل

Greedy RankRLS: a Linear Time Algorithm for Learning Sparse Ranking Models

Ranking is a central problem in information retrieval. Much work has been done in the recent years to automate the development of ranking models by means of supervised machine learning. Feature selection aims to provide sparse models which are computationally efficient to evaluate, and have good ranking performance. We propose integrating the feature selection as part of the training process fo...

متن کامل

Towards Feature Selection in Networks

Traditional feature selection methods assume that the data are independent and identically distributed (i.i.d.). In real world, tremendous amounts of data are distributed in a network. Existing features selection methods are not suited for networked data because the i.i.d. assumption no longer holds. This motivates us to study feature selection in a network. In this paper, we present a supervis...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2009